The causal inference literature has provided a clear formal definition of
confounding expressed in terms of counterfactual independence. The literature
has not, however, come to any consensus on a formal definition of a
confounder, as it has given priority to the concept of confounding over that
of a confounder. We consider a number of candidate definitions arising from
various more informal statements made in the literature. We consider the
properties satisfied by each candidate definition, principally focusing on
(i) whether under the candidate definition control for all "confounders"
suffices to control for "confounding" and (ii) whether each confounder in
some context helps eliminate or reduce confounding bias. Several of the
candidate definitions do not have these two properties. Only one candidate
definition of those considered satisfies both properties. We propose that a
"confounder" be defined as a pre-exposure covariate C for which there exists
a set of other covariates X such that effect of the exposure on the outcome
is unconfounded conditional on (X, C) but such that for no proper subset of
(X, C) is the effect of the exposure on the outcome unconfounded given the
subset. We also provide a conditional analogue of the above definition; and
we propose a variable that helps reduce bias but not eliminate bias be
referred to as a "surrogate confounder." These definitions are closely
related to those given by Robins and Morgenstern [Comput. Math. Appl. 14
(1987) 869-916]. The implications that hold among the various candidate
definitions are discussed.

1. Introduction. Statisticians and epidemiologists had traditionally
conceived of a confounder as a pre-exposure variable that was associated with
exposure and associated also with the outcome conditional on the exposure, 
possibly conditional also on other covariates [Miettinen (1974)]. The devel- 
opments in causal inference over the past two decades have made clear that 
this definition of a "confounder" is inadequate: there can be pre-exposure
variables associated with the exposure and the outcome, the control of which 
introduces rather than eliminates bias [Greenland, Pearl and Robins (1999), 
Glymour and Greenland (2008), Pearl (2009)]. The literature has moved 
away from formal language about "confounders" and instead places the con- 
ceptual emphasis on "confounding." See Morabia (2011) for historical dis- 
cussion of this point. The causal inference literature has provided a formal 
definition of "confounding" in terms of dependence of counterfactual out- 
comes and exposure, possibly conditional on covariates. The absence of con- 
founding (independence of the counterfactual outcomes and the exposure) 
has been taken as the foundational assumption for drawing causal inferences. 
Such absence of confounding is alternatively referred to as "ignorability" or 
"ignorable treatment assignment" [Rubin (1978)], "exchangeability" [Green- 
land and Robins (1986)], "no unmeasured confounding" [Robins (1992)], 
"selection on observables" [Barnow, Cain and Goldberger (1980), Imbens 
(2004)] or "exogeneity" [Imbens (2004)]. Today, at least within the formal 
methodological literature on causality, language concerning "confounders" is 
generally used only informally, if at all. The priority that has been given to 
"confounding" over "confounders" has arguably brought clarity and preci- 
sion to the field. Nevertheless, among practicing statisticians and epidemiol- 
ogists, language concerning both "confounders" and "confounding" is com- 
mon. This raises the question as to whether a formal definition of a "con- 
founder" can also be given within the counterfactual framework that coheres 
with how the word seems to be used in practice. 

In this paper we will consider various definitions of a confounder pro- 
posed either formally or informally by a number of prominent statisticians 
and epidemiologists. For each potential definition we will consider the prop- 
erties satisfied by the candidate definition. Specifically, we state and prove 
a number of propositions showing whether under each candidate definition 
(i) control for all "confounders" suffices to control for "confounding" and 
(ii) whether each confounder in some context helps eliminate or reduce con- 
founding bias. As we will see below, only one candidate definition of those 
considered satisfies both properties. We consider also the implications that 
hold between the various definitions themselves. 

2. Notation and framework. We let A denote an exposure, Y the out- 
come, and we will use C, S and X to denote particular pre-exposure co- 
variates or sets of covariates (that may or may not be measured). As noted 
in the penultimate section of the paper, the restriction to pre-exposure co- 
variates could, in the context of causal diagrams [Pearl (1995, 2009)], be 
replaced by that of nondescendants of exposure A. Within the counterfactual
or potential outcomes framework [Neyman (1923), Rubin (1978)], we
let $Y_a$ denote the potential outcome for Y if exposure A were set, possibly
contrary to fact, to the value a. If the exposure is binary, the average causal
effect is given by $E(Y_1) - E(Y_0)$. Note that the potential outcomes notation
$Y_a$ presupposes that an individual's potential outcome does not depend on
the exposures of other individuals. This assumption is sometimes referred 
to as SUTVA, the stable unit treatment value assumption [Rubin (1990)] or 
as a no-interference assumption [Cox (1958)]. 

We use the notation $E \perp\!\!\!\perp F \mid G$ to denote that E is
independent of F conditional on G. For exposure A and outcome Y, we say there
is no confounding conditional on S (or that the effect of A on Y is
unconfounded given S) if $Y_a \perp\!\!\!\perp A \mid S$. We will refer to any
such S as a sufficient set or a sufficient adjustment set. If the effect of A
on Y is unconfounded given S, then the causal effect can be consistently
estimated by
$E(Y_1) - E(Y_0) = \sum_s \{E(Y \mid A = 1, s) - E(Y \mid A = 0, s)\}\,\mathrm{pr}(s)$
[Rosenbaum and Rubin (1983)]. We will say that $S = (S_1, \ldots, S_n)$
constitutes a minimally sufficient adjustment set if
$Y_a \perp\!\!\!\perp A \mid S$ but there is no proper subset T of S such that
$Y_a \perp\!\!\!\perp A \mid T$, where "proper subset" here is understood as T
being a strict subset of the coordinates of $S = (S_1, \ldots, S_n)$.
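
As an illustration of the standardization formula above, the following is a minimal sketch for a single binary covariate S. The function names and all numerical values are hypothetical and chosen only for illustration; the standardized contrast equals the causal effect only under the assumption that the effect of A on Y is unconfounded given S.

```python
def standardized_effect(pr_s, mean_y):
    """Return sum_s {E(Y|A=1,s) - E(Y|A=0,s)} pr(s)."""
    return sum((mean_y[(1, s)] - mean_y[(0, s)]) * p for s, p in pr_s.items())


def crude_effect(pr_s, pr_a1_given_s, mean_y):
    """Return the unadjusted contrast E(Y|A=1) - E(Y|A=0)."""
    means = {}
    for a in (0, 1):
        # Joint weight pr(s, A=a) for each stratum s.
        w = {s: p * (pr_a1_given_s[s] if a == 1 else 1 - pr_a1_given_s[s])
             for s, p in pr_s.items()}
        means[a] = sum(w[s] * mean_y[(a, s)] for s in w) / sum(w.values())
    return means[1] - means[0]


# Hypothetical distribution (illustrative numbers, not from the paper).
pr_s = {0: 0.6, 1: 0.4}               # pr(S = s)
pr_a1_given_s = {0: 0.2, 1: 0.7}      # pr(A = 1 | S = s)
mean_y = {(0, 0): 0.1, (1, 0): 0.3,   # E(Y | A = a, S = s), keyed by (a, s)
          (0, 1): 0.5, (1, 1): 0.7}

print(round(standardized_effect(pr_s, mean_y), 3))          # 0.2
print(round(crude_effect(pr_s, pr_a1_given_s, mean_y), 3))  # 0.4
```

In this toy distribution the stratum-specific contrast is 0.2 in both strata, whereas the unadjusted contrast is 0.4 because S is associated with both A and Y.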

Some of the candidate definitions of a confounder below define "con- 
founder" in terms of "confounding" via reference to "sufficient adjustment 
sets" or "minimally sufficient adjustment sets." Such definitions give con- 
ceptual priority to "confounding," as has generally been done in the causal 
inference literature [Greenland and Robins (1986), Greenland and Morgen- 
stern (2001), Hernan (2008)]. Often after formal definitions of "confounding" 
are given, a "confounder" is defined as a derivative and sometimes informal 
concept. For example, in papers by Greenland, Pearl and Robins (1999) and 
Greenland and Morgenstern (2001), formal definitions are given for "con- 
founding" and then a "confounder" is simply described as a variable that is 
in some sense "responsible" [Greenland, Robins and Pearl (1999), page 33] 
for confounding. Although priority arguably has and should be given to the 
concept of "confounding" over "confounder," applied researchers will often 
use the word "confounder" to refer to a single variable that is perhaps a 
member of a sufficient adjustment set but does not by itself constitute a 
sufficient adjustment set and this raises the question of whether this use of 
"confounder" can be given a coherent definition within the counterfactual 
framework. 

Most of the definitions and properties we discuss make reference only to 
counterfactual outcomes. However, one of the definitions and several propo- 
sitions make reference to causal diagrams. We will thus restrict attention 
in this paper to causal diagrams. We review concepts and definitions for 
causal diagrams in the Appendix; the reader can also consult Pearl (1995, 
2009). For expository purposes we follow Pearl (1995), but the results in 
the paper are equally applicable to all of the alternative graphical causal 
models considered, for example, by Robins and Richardson (2010). In short, 
following Pearl (1995), a causal diagram is a very general data generating
process corresponding to a set of nonparametric structural equations
where each variable $X_i$ is given by its nonparametric structural equation
$X_i = f_i(pa_i, \epsilon_i)$, where $pa_i$ are the parents of $X_i$ on the
graph and the $\epsilon_i$ are mutually independent such that the structural
equations encode one-step ahead counterfactual relationships among the
variables with other counterfactuals given by recursive substitution
[Pearl (1995, 2009)]. The assumption
of "faithfulness" is said to be satisfied if all of the conditional independence 
relationships among the variables are implied by the structure of the graph; 
see the Appendix for further details. A backdoor path from A to Y is a path
to Y which begins with an edge into A. Pearl (1995) showed that if a set of
pre-exposure covariates S blocks all backdoor paths from A to Y, then the
effect of A on Y is unconfounded given S.
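
The backdoor condition can be checked mechanically on a given diagram. The sketch below is one possible implementation, not taken from the paper: it assumes S contains only nondescendants of A, deletes the edges out of A, and then tests d-separation of A and Y given S using the standard moralized ancestral graph construction. The graph encoding (a dictionary mapping each node to its list of parents), the function names and the example graph are all hypothetical.

```python
def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def d_separated(parents, x, y, z):
    """True if x and y are d-separated given z (moralized ancestral graph)."""
    keep = ancestors(parents, {x, y} | set(z))
    nbrs = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))  # parents of a kept node are kept
        for p in pa:                       # undirected version of each edge
            nbrs[child].add(p)
            nbrs[p].add(child)
        for i, p in enumerate(pa):         # "marry" parents of a common child
            for q in pa[i + 1:]:
                nbrs[p].add(q)
                nbrs[q].add(p)
    blocked, seen, stack = set(z), set(), [x]
    while stack:                           # search that never enters z
        v = stack.pop()
        if v == y:
            return False
        if v not in seen:
            seen.add(v)
            stack.extend(nbrs[v] - blocked)
    return True


def blocks_all_backdoor_paths(parents, a, y, s):
    """Assumes s holds only nondescendants of a; removes edges out of a."""
    pruned = {child: [p for p in pa if p != a] for child, pa in parents.items()}
    return d_separated(pruned, a, y, s)


# Hypothetical diagram: A <- C1 -> C2 -> Y and A -> Y.
graph = {"A": ["C1"], "C2": ["C1"], "Y": ["A", "C2"]}
print(blocks_all_backdoor_paths(graph, "A", "Y", set()))   # False
print(blocks_all_backdoor_paths(graph, "A", "Y", {"C1"}))  # True
print(blocks_all_backdoor_paths(graph, "A", "Y", {"C2"}))  # True
```

In this hypothetical diagram either C1 or C2 alone blocks the single backdoor path, while the empty set does not.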

The definitions given below will be stated formally in terms of potential 
outcomes and causal diagrams. It is assumed that there is an underlying 
causal diagram which may contain both measured and unmeasured vari- 
ables; all variables considered in the definitions are variables on the dia- 
gram. Whether a variable satisfies the criteria of a particular definition will 
be relative to the causal diagram. In Section 6 we will consider settings with 
multiple causal diagrams where one diagram may have variables absent on 
another. 

3. Candidate definitions for a confounder. Here we give a number of 
candidate definitions of a confounder motivated by statements made in the 
methodological literature. We will cite specific statements from the method- 
ologic literature; we do not necessarily believe these statements were in- 
tended as formal definitions of a "confounder" by the authors cited. We 
simply use these statements to motivate the candidate definitions. As noted 
above, we believe statements about "confounders," as opposed to "confounding,"
have generally been used only informally and intuitively.

As already noted, the traditional conception of a confounder in statistics 
and epidemiology has been a variable associated with both the treatment 
and the outcome. Miettinen (1974) notes that whether such associations 
hold will depend on what other variables are controlled for in an analysis. 
This motivates our first candidate definition for a confounder. 

Definition 1. A pre-exposure covariate C is a confounder for the effect
of A on Y if there exists a set of pre-exposure covariates X such that
$C \not\perp\!\!\!\perp A \mid X$ and $C \not\perp\!\!\!\perp Y \mid (A, X)$.

Definition 1 is essentially a generalization of the traditional conceptual- 
ization of a confounder. 

Pearl (1995) showed that if a set of pre-exposure covariates X blocks all
backdoor paths from A to Y, then the effect of A on Y is unconfounded
given X. Hernan (2008) accordingly speaks of a confounder as a variable
that "can be used to block a backdoor path between exposure and outcome" 
(page 355). A similar definition of a confounder is given in Greenland and 
Pearl [(2007), page 152] and in Glymour and Greenland [(2008), page 193]. 
This motivates a second candidate definition. 

Definition 2. A pre-exposure covariate C is a confounder for the effect
of A on Y if it blocks a backdoor path from A to Y.

The second definition is perhaps one that would arise most naturally 
within the context of causal diagrams; the definition itself of course presup- 
poses a framework of causal diagrams or variants thereof [Spirtes, Glymour 
and Scheines (1993), Dawid (2002)]. 

Pearl (2009) speaks of a confounder as "a variable that is a member of 
every sufficient [adjustment] set" (page 195), that is, control for it must 
be necessary. Likewise, Robins and Greenland (1986) write, "We will call a 
covariate a confounder if estimators which are not adjusted for the covariate 
are biased" (page 393) and Hernan (2008) speaks of a confounder as "any 
variable that is necessary to eliminate the bias in the analysis" (page 357). 
Note that a variable is a member of every sufficient adjustment set if and only 
if it is a member of every minimal sufficient adjustment set. This motivates 
our third candidate definition. 

Definition 3. A pre-exposure covariate C is a confounder for the effect 
of A on Y if it is a member of every minimally sufficient adjustment set.

Definition 3 captures the notion that controlling for a confounder might 
be necessary to eliminate bias. The definition makes reference to "every 
minimally sufficient adjustment set;" this will be relative to a particular 
causal diagram, a point to which we will return below. 

Kleinbaum, Kupper and Morgenstern (1982), in a textbook on epidemi- 
ologic research, gave as a definition of a "confounder" a variable that is 
"a member of a sufficient confounder group" where a sufficient confounder 
group is defined as "a minimal set of one or more risk factors whose si- 
multaneous control in the analysis will correct for joint confounding in the 
estimation of the effect of interest" (page 276). Kleinbaum, Kupper and 
Morgenstern (1982), however, define "confounding" in terms of association 
rather than counterfactual independence. As a variant of the Kleinbaum, 
Kupper and Morgenstern proposal, we could retain the definition "a mem- 
ber of a minimally sufficient adjustment set" but use the counterfactual 
definition of "confounding." This motivates the fourth candidate definition. 

Definition 4. A pre-exposure covariate C is a confounder for the effect 
of A on Y if it is a member of some minimally sufficient adjustment set.

Definition 4 can be restated as follows: a pre-exposure covariate C is a
confounder for the effect of A on Y if there exists a set of pre-exposure
covariates X (possibly empty) such that $Y_a \perp\!\!\!\perp A \mid (X, C)$
but there is no proper subset T of (X, C) such that
$Y_a \perp\!\!\!\perp A \mid T$. Robins and Morgenstern (1987) and
Dawid (2002) likewise conceive of a confounder in terms of the presence or 
absence of confounding in such a way that coincides with Definition 4 when 
there is a single confounder; when there are multiple sets that are sufficient 
or sets that are sufficient but not minimally sufficient, it is not clear how 
the definition of Dawid (2002) generalizes; the definitions of Robins and 
Morgenstern (1987) can be adapted to coincide with Definition 4. Robins 
and Morgenstern [(1987), Section 2H] say that C is a confounder condi- 
tional on F if causal effects are computable given data on C and F, but 
not on F alone. In the framework of Robins and Morgenstern, if one were 
to take as the (unconditional) definition of a confounder that "there exists 
some set F such that C is a confounder conditional on F [in the sense of 
Robins and Morgenstern (1987), Section 2H]," then this would coincide with 
Definition 4. 

Miettinen and Cook (1981) conceive of a confounder as any variable that 
is helpful in reducing bias. Hernan (2008) likewise speaks of a confounder 
as "any variable that can be used to reduce [confounding] bias" (page 355). 
Geng, Guo and Fung (2002) use a similar definition for confounding. As 
noted by other authors [Greenland and Morgenstern (2001), Hernan (2008)], 
whether a variable is helpful in reducing bias will depend on what other 
variables are being conditioned on in the analysis; a confounder should be 
helpful for reducing bias in some context. This motivates our fifth definition. 

Definition 5. A pre-exposure covariate C is a confounder for the effect
of A on Y if there exists a set of pre-exposure covariates X such that
$\left|\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) - \{E(Y_1) - E(Y_0)\}\right| <
\left|\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\right|$.

Definition 5 captures the notion that controlling for C along with X 
results in lower bias in the estimate of the causal effect than controlling for 
X alone. A number of variants of Definition 5 could also be considered. Geng, 
Guo and Fung (2002), for example, considered the analogous definition for 
the effect of the exposure on the exposed rather than the overall effect of 
the exposure on the population; one could likewise consider the analogue of 
Definition 5 for effects conditional on X rather than standardized over X or, 
alternatively, for different measures of effect, for example, risk ratios or odds 
ratios rather than causal effects on the difference scale. Definition 5, unlike 
other definitions, is inherently scale-dependent. Thus, under Definition 5, a 
variable C might be a confounder for Y but not for log(Y) or vice versa. This
is an important limitation of Definition 5. Note, however, that some authors 
also consider "confounding" to be scale-dependent [Greenland and Robins
(1986, 2009), Greenland and Morgenstern (2001)] and use "ignorability" to 
refer to the notion of unconfoundedness in the distribution of counterfactuals 
as given above. 

Confounders have also sometimes been defined in terms of empirical col- 
lapsibility [Miettinen (1976), Breslow and Day (1980)], that is, if one obtains 
the same estimate with or without adjustment for a variable, then it is not 
a confounder. In the applied literature the approach is sometimes encapsu- 
lated in the "10 percent rule," that is, discard a covariate if adjustment for it 
does not change an estimate by more than 10 percent. It is well documented 
in the literature that collapsibility-based definitions do not work for every
effect measure; for the odds ratio or the hazard ratio, for example, marginal
and conditional effects may differ even in the absence of confounding [Greenland,
Robins and Pearl (1999)]. Such effect measures are sometimes referred to as 
noncollapsible. However, for at least the risk difference scale (or the risk ratio 
scale) a collapsibility-based definition of a confounder could be entertained 
and for completeness we consider it also here. Such a collapsibility-based 
definition could be formalized as follows. 

Definition 6. A pre-exposure covariate C is a confounder for the effect
of A on Y if there exists a set of pre-exposure covariates X such that
$\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) \neq
\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$.

Definition 6, like Definition 5, is scale-dependent. 

Although not the focus of the present paper, in the Appendix we give 
some further remarks on the possibility of empirical testing for each of Defini- 
tions 1-6 and for confounding and nonconfounding more generally. However, 
for the most part, notions of confounding and confounders, under these six 
definitions, are not empirically testable without further experimental data 
or strong assumptions. 

4. Properties of a confounder. Language about "confounders" occurs of 
course not simply in methodologic work but in substantive statistical and 
epidemiologic research. In the design and analysis of observational studies in 
the applied literature the task of controlling for "confounding" is often con- 
strued as that of collecting data on and controlling for all "confounders." In 
this section we propose that when language about "confounders" is generally 
used in statistics and epidemiology, two things are implicitly presupposed: 
first, that if one were to control for all "confounders," then this would suffice 
to control for "confounding" and, second, that control for a "confounder" 
will in some sense help to reduce or eliminate confounding bias. We would 
propose that if a formal definition is to be given for a "confounder," it should 
in some sense satisfy these two properties. If it does not, it arguably does
not cohere with what is typically presupposed when language about "con- 
founders" is used in practice. We give a formalization of these two properties 
and in the following section we will discuss which of these two properties 
are satisfied by each of the candidate definitions of the previous section. 
We could formalize the first property as follows. 

Property 1. If S consists of the set of all confounders for the effect of
A on Y, then there is no confounding of the effect of A on Y conditional on
S, that is, $Y_a \perp\!\!\!\perp A \mid S$.

Property 1 makes reference to "all confounders;" to make reference
to all such variables, the domain of the variables considered needs to be 
specified. The domain here will be all pre-exposure variables on a particular 
causal diagram that qualify as confounders according to whatever definition 
is in view. See Section 6 for some extensions. 

The second property is that control for a confounder should help either 
reduce or eliminate bias. The reduction and the elimination of bias are not 
equivalent and, thus, we will formally give two alternative properties, 2A 
and 2B. 

Property 2A. If C is a confounder for the effect of A on Y, then 
there exists a set of pre-exposure covariates X (possibly empty) such that 
$Y_a \perp\!\!\!\perp A \mid (X, C)$ but $Y_a \not\perp\!\!\!\perp A \mid X$.

Property 2B. If C is a confounder for the effect of A on Y, then 
there exists a set of pre-exposure covariates X (possibly empty) such that 
$\left|\sum_{x,c}\{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) - \{E(Y_1) - E(Y_0)\}\right| <
\left|\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\right|$.

Property 2A captures the notion that in some context, that is, conditional
on X, the covariate C helps eliminate bias. Property 2B captures the
notion that in some context, that is, conditional on X, the covariate C helps 
reduce bias. Note that Property 2B, like Definition 5, is inherently scale- 
dependent and in this sense perhaps less fundamental than Property 2A. 
For now we simply propose that for a candidate definition of a confounder 
to adequately capture the intuitive sense in which the word is used, it should 
satisfy Property 1 and should also satisfy either Property 2A or 2B. It would 
be peculiar if a confounder were defined in a way that it did not satisfy these 
two properties. In the next section we consider whether each of the candidate
definitions, Definitions 1-6, satisfies Properties 1, 2A and 2B. Of course, one
possible outcome of this exercise is that none of the candidate definitions 
satisfy Property 1 and either Properties 2A or 2B (or even that no candidate 
definition could). However, as we will see in the next section, this turns out 
not to be the case. 
Fig. 1. Definition 1 does not satisfy Property 2A or 2B.

5. Properties of the candidate definitions. Definition 1 was a generalization
of the traditional epidemiologic conception of a confounder as a variable
associated with exposure and outcome. For this definition we have the fol- 
lowing result. 

Proposition 1. Under faithfulness, for every causal diagram, Definition 1
satisfies Property 1. Definition 1 does not satisfy Properties 2A or 2B.

Proof. We first show that Definition 1 satisfies Property 1 in faithful 
models. 

Let $G^* = G_{\mathrm{Nd}(A) \cup \mathrm{An}(Y)}$ be the subgraph of G that
has only the nodes in Nd(A) or An(Y); see the Appendix. Let $\mathrm{Pa}^*$ be
the subset of Pa(A) in $G^*$ such that every element $P \in \mathrm{Pa}^*$
contains some path in $G^*$ to Y not through A. Since we consider faithful
models, we can use d-connectedness to represent dependence. First we note
that every element in $\mathrm{Pa}^*$ satisfies Definition 1. Indeed, any
element of Pa(A) is dependent on A conditioned on any set. For any member of
$\mathrm{Pa}^*$, we fix some path $\pi$ to Y (not through A). We are now free
to pick any set X to make this path d-connected (e.g., we can pick the
smallest X that opens all colliders in $\pi$). This set X satisfies
Definition 1 for $\mathrm{Pa}^*$ with respect to A and Y. Thus, the set of all
nodes in Nd(A) satisfying Definition 1 will include $\mathrm{Pa}^*$. Next, we
show that any superset of $\mathrm{Pa}^*$ in Nd(A) will be a valid adjustment
set for (A, Y). Assume this is not the case for a particular S, and fix a
backdoor path from A to Y which is open given S. Then the first node on this
path after A must be in $\mathrm{Pa}^*$. But this means the path is blocked
by S. Our conclusion follows.

We now show Definition 1 does not satisfy Properties 2A or 2B. Consider
the causal diagram in Figure 1. The variable C3 is unconditionally associated
with A and Y; the variables C1 and C2 are each associated with A and
Y conditional on C3. Thus, under Definition 1, all three would qualify as
"confounders." However, there is no set of pre-exposure covariates X on
the graph such that control for C3 helps eliminate or reduce bias. To see
this, note that if X includes C1 or C2, then the effect estimate is unbiased
irrespective of whether adjustment is made for C3. If X includes neither C1
nor C2, then the estimand without adjustment for C3 is unbiased whereas
the estimand adjusted for C3 is not. Therefore, Definition 1 does not satisfy
Properties 2A or 2B. This completes the proof. □
Fig. 2. Definition 2 does not satisfy Property 2A or 2B.

Intuitively, Definition 1 does not satisfy Properties 2A or 2B because in 
the causal diagram in Figure 1, the variable C3 is unconditionally associated 
with A and Y and thus would be a confounder under Definition 1, but control 
for it will only either not affect bias (if control is made for C1 or C2) or
increase bias (if control is made for neither C1 nor C2). The causal structure in
Figure 1 and the bias resulting from controlling for C3 is sometimes referred
to in the literature as "M-bias" or "collider-stratification" [Greenland (2003),
Hernan et al. (2002), Hernan (2008)]. We note that if faithfulness is violated,
Definition 1 does not satisfy Property 1 either [Pearl (2009)].
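
The M-bias phenomenon can be illustrated numerically by exact enumeration. The sketch below assumes the usual "M" structure for Figure 1 (C1 -> A, C1 -> C3 <- C2, C2 -> Y, A -> Y), which is our reading of the surrounding text rather than a reproduction of the figure; all probabilities are hypothetical. By construction the true risk difference is 0.3.

```python
from itertools import product


def bern(p, v):
    """Probability that a Bernoulli(p) variable takes the value v."""
    return p if v == 1 else 1 - p


def joint():
    """Yield (values, probability) for every cell of (C1, C2, C3, A, Y)."""
    for c1, c2, c3, a, y in product((0, 1), repeat=5):
        p = (bern(0.5, c1) * bern(0.5, c2)             # C1, C2 independent
             * bern(0.1 + 0.4 * c1 + 0.4 * c2, c3)     # collider C3
             * bern(0.2 + 0.6 * c1, a)                 # A depends on C1 only
             * bern(0.2 + 0.3 * a + 0.4 * c2, y))      # true effect of A is 0.3
        yield {"C1": c1, "C2": c2, "C3": c3, "A": a, "Y": y}, p


def prob(event):
    """Probability that every variable named in event takes the stated value."""
    return sum(p for v, p in joint() if all(v[k] == x for k, x in event.items()))


def adjusted_effect(adjust_for=()):
    """Standardized risk difference over the binary covariates in adjust_for."""
    total = 0.0
    for levels in product((0, 1), repeat=len(adjust_for)):
        s = dict(zip(adjust_for, levels))
        m1 = prob({**s, "A": 1, "Y": 1}) / prob({**s, "A": 1})
        m0 = prob({**s, "A": 0, "Y": 1}) / prob({**s, "A": 0})
        total += (m1 - m0) * prob(s)
    return total


print(round(adjusted_effect(()), 4))            # no adjustment
print(round(adjusted_effect(("C3",)), 4))       # adjust for the collider only
print(round(adjusted_effect(("C1", "C3")), 4))  # adjust for C1 and C3
```

Under these assumed parameters the unadjusted contrast and the contrast adjusted for both C1 and C3 recover 0.3, while adjustment for C3 alone does not.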

Under Definition 2, a confounder was defined as a pre-exposure covariate 
that blocks a backdoor path from A to Y. 

Proposition 2. For every causal diagram, Definition 2 satisfies Property 1.
Definition 2 does not satisfy Properties 2A or 2B.

Proof. If S consists of the set of all confounders under Definition 2, 
then this set S will include all pre-exposure covariates that block a backdoor 
path from A to Y. From this it follows that S blocks all backdoor paths
from A to Y and by Pearl's backdoor path theorem, the effect of A on Y is
unconfounded given S. Thus, Definition 2 satisfies Property 1.

We now show that it does not satisfy Properties 2A and 2B. Consider
the causal diagram in Figure 2. Under Definition 2 both C1 and C2 block
a backdoor path from A to Y and thus would qualify as confounders. However,
for C2 there is no set of pre-exposure covariates X on the graph such
that control for C2 helps eliminate bias, since if X = C1, there is no bias
without controlling for C2; if X = ∅, there is bias even with controlling for
C2. Thus, Definition 2 does not satisfy Property 2A. We now show that
it does not satisfy Property 2B. Suppose Figure 2 is a causal diagram for
(C1, C2, A, Y) where all variables are binary and suppose that
$P(C_1 = 1) = 1/2$, $P(C_2 = 1 \mid c_1) = 1/5 + 3c_1/5$,
$P(A = 1 \mid c_1, c_2) = 1/10 + 3c_1/5 + c_2/10$,
$P(Y = 1 \mid a, c_1, c_2) = 1/2 + (1/2)(a - 1/2)c_1$. One can then verify that
$E(Y_1) - E(Y_0) = \sum_{c_1,c_2}\{E(Y \mid A = 1, c_1, c_2) - E(Y \mid A = 0, c_1, c_2)\}\,\mathrm{pr}(c_1, c_2) = 0.25 =
\sum_{c_1}\{E(Y \mid A = 1, c_1) - E(Y \mid A = 0, c_1)\}\,\mathrm{pr}(c_1)$,
that $E(Y \mid A = 1) - E(Y \mid A = 0) = 0.266$ and that
$\sum_{c_2}\{E(Y \mid A = 1, c_2) - E(Y \mid A = 0, c_2)\}\,\mathrm{pr}(c_2) = 0.269$.
Under Definition 2, C2 would be considered a confounder since C2 blocks
the backdoor path $A \leftarrow C_2 \leftarrow C_1 \rightarrow Y$. However,
there is no set X of pre-exposure covariates such that
$\left|\sum_{x,c_2}\{E(Y \mid A = 1, x, c_2) - E(Y \mid A = 0, x, c_2)\}\,\mathrm{pr}(x, c_2) - \{E(Y_1) - E(Y_0)\}\right| <
\left|\sum_{x}\{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x) - \{E(Y_1) - E(Y_0)\}\right|$.
This is because if X is taken as C1, then the expressions on both sides of
the inequality are equal to 0 (controlling for C2 in addition to C1 does not
reduce bias); if X is taken as the empty set, we have
$\left|\sum_{c_2}\{E(Y \mid A = 1, c_2) - E(Y \mid A = 0, c_2)\}\,\mathrm{pr}(c_2) - \{E(Y_1) - E(Y_0)\}\right|
= |0.269 - 0.250| = 0.019 > 0.016 = |0.266 - 0.250|
= \left|\{E(Y \mid A = 1) - E(Y \mid A = 0)\} - \{E(Y_1) - E(Y_0)\}\right|$
and again controlling for C2 does not reduce (but rather increases) bias.
Definition 2 thus does not satisfy Property 2B. This completes the proof. □

Fig. 3. Definition 3 does not satisfy Property 1.

If we consider the causal diagram in Figure 2, then under Definition 2 
both C1 and C2 block a backdoor path from A to Y and thus would qualify
as confounders. However, for C2 there is no set of pre-exposure covariates
X on the graph such that control for C2 helps eliminate bias (Property 2A)
since if X = C1, there is no bias without controlling for C2; if X = ∅, there
is bias even with controlling for C2. Likewise, examples can be constructed 
as in the proof above in which control for C2 will only increase bias, that is, 
control for C2 does not help reduce bias (Property 2B). 
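
The numerical example in the proof of Proposition 2 can be checked by exact enumeration. The following sketch uses the four conditional probabilities stated in the proof; the helper names are our own and hypothetical, and the code is offered only as a verification aid.

```python
from itertools import product


def bern(p, v):
    """Probability that a Bernoulli(p) variable takes the value v."""
    return p if v == 1 else 1 - p


def p_y1(a, c1):
    """P(Y = 1 | a, c1, c2) from the proof; it does not depend on c2."""
    return 1 / 2 + (1 / 2) * (a - 1 / 2) * c1


def joint():
    """Yield (values, probability) for every cell of (C1, C2, A, Y)."""
    for c1, c2, a, y in product((0, 1), repeat=4):
        p = (bern(1 / 2, c1)
             * bern(1 / 5 + 3 * c1 / 5, c2)
             * bern(1 / 10 + 3 * c1 / 5 + c2 / 10, a)
             * bern(p_y1(a, c1), y))
        yield {"C1": c1, "C2": c2, "A": a, "Y": y}, p


def prob(event):
    """Probability that every variable named in event takes the stated value."""
    return sum(p for v, p in joint() if all(v[k] == x for k, x in event.items()))


def adjusted_effect(adjust_for=()):
    """Standardized risk difference over the binary covariates in adjust_for."""
    total = 0.0
    for levels in product((0, 1), repeat=len(adjust_for)):
        s = dict(zip(adjust_for, levels))
        m1 = prob({**s, "A": 1, "Y": 1}) / prob({**s, "A": 1})
        m0 = prob({**s, "A": 0, "Y": 1}) / prob({**s, "A": 0})
        total += (m1 - m0) * prob(s)
    return total


# True effect E(Y_1) - E(Y_0), computed from the structural model for Y.
truth = sum(bern(1 / 2, c1) * (p_y1(1, c1) - p_y1(0, c1)) for c1 in (0, 1))
crude = adjusted_effect(())
adj_c2 = adjusted_effect(("C2",))
adj_c1 = adjusted_effect(("C1",))
adj_both = adjusted_effect(("C1", "C2"))
print(truth, crude, adj_c2, adj_c1, adj_both)
```

The enumeration gives a true effect of 0.25, an unadjusted contrast of about 0.267 and a C2-adjusted contrast of about 0.269, so that adjusting for C2 alone moves the estimate further from the truth, in agreement with the proof.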

Under Definition 3, a confounder was defined as a member of every min- 
imally sufficient adjustment set. 

Proposition 3. Definition 3 does not satisfy Property 1. Definition 3 
satisfies Property 2A. 

Proof. Consider the causal diagram in Figure 3. Here, either C1 or C2
would constitute a minimally sufficient adjustment set and thus neither is a
member of every minimally sufficient adjustment set and under Definition 3,
neither would be a confounder. If we control for nothing, there is still
confounding for the effect of A on Y and, thus, for Figure 3, controlling for all
confounders under Definition 3 would not suffice to control for confounding.
Thus, Definition 3 does not satisfy Property 1. If C is a member of every
minimally sufficient adjustment set, then it is a member of a minimally
sufficient adjustment set and from this it trivially follows that it satisfies the
requirements in Property 2A. This completes the proof. □

A variable C that is a confounder under Definition 3 will in general satisfy
Property 2B as well but may not always, because there are cases in
which there is confounding in the distribution of counterfactual outcomes
conditional on C, so that C is a confounder under Definition 3, but with
the average causal effect on the additive scale not confounded [Greenland,
Robins and Pearl (1999)]. Intuitively, to see that Definition 3 does not
satisfy Property 1, consider the causal diagram in Figure 3. Here, either C1 or
C2 would constitute a minimally sufficient adjustment set and thus neither
is a member of every minimally sufficient adjustment set. Under Definition 3,
there would thus be no confounders for the effect of A on Y; clearly,
however, if we control for nothing, there is still confounding for the effect of
A on Y.
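
The contrast between membership in every and in some minimally sufficient adjustment set can be made concrete with a small enumeration. The sketch below takes a sufficiency predicate as given (it does not derive it from a diagram); the predicate used here encodes only what the text states about Figure 3, namely that C1 alone or C2 alone suffices and the empty set does not. Function names are hypothetical.

```python
from itertools import combinations


def minimally_sufficient_sets(covariates, is_sufficient):
    """All sufficient sets that have no sufficient proper subset."""
    sufficient = [frozenset(s)
                  for r in range(len(covariates) + 1)
                  for s in combinations(covariates, r)
                  if is_sufficient(frozenset(s))]
    return [s for s in sufficient if not any(t < s for t in sufficient)]


def confounders_every(minimal_sets):
    """Members of every minimally sufficient set (Definition 3)."""
    return frozenset.intersection(*minimal_sets) if minimal_sets else frozenset()


def confounders_some(minimal_sets):
    """Members of some minimally sufficient set (Definition 4)."""
    return frozenset().union(*minimal_sets)


# Assumed sufficiency structure for Figure 3, as described in the text.
def is_sufficient(s):
    return "C1" in s or "C2" in s


minimal = minimally_sufficient_sets(["C1", "C2"], is_sufficient)
every = confounders_every(minimal)
some = confounders_some(minimal)
print(sorted(map(sorted, minimal)))             # [['C1'], ['C2']]
print(sorted(every), is_sufficient(every))      # [] False
print(sorted(some), is_sufficient(some))        # ['C1', 'C2'] True
```

Under this assumed predicate, the set of variables that belong to every minimally sufficient set is empty and hence not sufficient, whereas the set of variables that belong to some minimally sufficient set is sufficient.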

Under Definition 4, a confounder was defined as a member of some mini- 
mally sufficient adjustment set. 

Proposition 4. For every causal diagram, Definition 4 satisfies Property 1.
Definition 4 satisfies Property 2A.

Proof. We will show that Definition 4 satisfies Property 1. We first
claim that any minimally sufficient adjustment set for (A, Y) must lie in
$G_{\mathrm{An}(A) \cup \mathrm{An}(Y)}$, the subgraph of G that has only the
nodes in Nd(A) or An(Y); see the Appendix. Assume this is not true, and pick
some minimally sufficient set S with elements outside
$\mathrm{An}(A) \cup \mathrm{An}(Y)$. This means
$S \cap (\mathrm{An}(A) \cup \mathrm{An}(Y))$ is not sufficient. Note that any
ancestor of a node in the set $\mathrm{An}(A) \cup \mathrm{An}(Y)$ will also
be in $\mathrm{An}(A) \cup \mathrm{An}(Y)$. From this it follows that any
backdoor path from A to Y which has a node outside
$\mathrm{An}(A) \cup \mathrm{An}(Y)$ will require a collider to get back into
$\mathrm{An}(A) \cup \mathrm{An}(Y)$. However, those colliders must be open by
elements in S. We have a contradiction. We have shown that any minimally
sufficient adjustment set must be a subset of
$\mathrm{An}(A) \cup \mathrm{An}(Y)$ and, thus, any variable that is a
confounder under Definition 4 must be in $\mathrm{An}(A) \cup \mathrm{An}(Y)$.

Next we note that Pa(A) is a sufficient adjustment set for (A, Y). Pick
a minimal subset $Pa^*$ of Pa(A) that is sufficient. Our claim is that every
element P in $Pa(A) \setminus Pa^*$ is such that P is not connected to Y in the
graph $(G_{An(A) \cup An(Y)})_{\overline{A}}$ except by paths that are blocked conditional on $Pa^*$.
Assume this is not true, and fix a path $\omega$ from P to Y that is not blocked
by $Pa^*$ in $(G_{An(A) \cup An(Y)})_{\overline{A}}$. If this path has no colliders, then appending $\omega$
with the edge $P \to A$ produces a backdoor path from A to Y not blocked
by $Pa^*$, contradicting the earlier claim that $Pa^*$ is a valid adjustment set.

If $\omega$ only contains colliders ancestral of $Pa^*$, then either $\omega$ has a
noncollider triple blocked by $Pa^*$ (in which case we are done with that path)
or $\omega$ appended with $P \to A$ produces a backdoor path open conditional
on $Pa^*$, which is a contradiction. If $\omega$ contains collider triples ancestral of
$Pa(A) \setminus Pa^*$ (but not ancestral of $Pa^*$), let W be the central node of the
last such collider triple on the path from P to Y. Let P' be a member of
$Pa(A) \setminus Pa^*$ of which W is an ancestor. Consider instead of $\omega$ a new path:
$A \leftarrow P' \leftarrow \cdots \leftarrow W$ appended with the subpath of $\omega$ that begins with the
node on $\omega$ after W and ends with Y. This path either has a noncollider
triple blocked by $Pa^*$ (in which case so does $\omega$ and we are done with $\omega$) or
it is open conditional on $Pa^*$, in which case we have a contradiction, or it
contains collider triples ancestral of Y not through Pa(A). In the last case,
let Z be the central node of the first such collider triple on the currently
considered path from A to Y. Consider instead a new path which appends
a subpath of the currently considered path extending from A to Z, and the
segment $Z \to \cdots \to Y$. This path has no blocked colliders by construction,
and thus must either have a noncollider triple blocked by $Pa^*$ (in which
case so does $\omega$ and we are done with $\omega$) or it is open conditional on $Pa^*$, in
which case we have a contradiction.

Our final claim is that any superset S of $Pa^*$ in $Nd(A) \cap (An(A) \cup An(Y))$
is a valid adjustment set for (A, Y). Assume this were not so and fix an open
backdoor path p from A to Y given S. The first node on p after A must lie
either in $Pa^*$ or in $Pa(A) \setminus Pa^*$. In the first case, the path is blocked. In the
second case, we have shown above that every path from $Pa(A) \setminus Pa^*$ to Y
in $(G_{An(A) \cup An(Y)})_{\overline{A}}$ is blocked by $Pa^*$ and, thus, the path must be blocked
in the second case as well. There thus cannot be an open backdoor path
from A to Y given S and we have a contradiction. We have that $Pa^*$ is a
sufficient adjustment set; any variable that is a confounder under Definition 4
will be a member of $Nd(A) \cap (An(A) \cup An(Y))$ and, thus, we have that the
set of variables that are confounders under Definition 4 will be a sufficient
adjustment set. Definition 4 thus satisfies Property 1. Definition 4 satisfies
Property 2A trivially. This completes the proof. □
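
As a brute-force illustration of Proposition 4, the following Python sketch enumerates the minimally sufficient adjustment sets on a small diagram and checks that their union is itself sufficient. The diagram (C1 -> C2 -> A -> Y together with C1 -> Y) is an assumption chosen so that either C1 or C2 alone suffices, as described for Figure 3; it is not taken from the figure itself. Sufficiency is operationalized here through the backdoor criterion reviewed in the Appendix, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical diagram (an assumption, chosen to mimic Figure 3):
# C1 -> C2 -> A -> Y and C1 -> Y. Each node maps to its list of parents.
PARENTS = {"C1": [], "C2": ["C1"], "A": ["C2"], "Y": ["C1", "A"]}


def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, x, y, z):
    """True if nodes x and y are d-separated given the set z."""
    keep = ancestors(parents, {x, y} | z)
    # Moralize the ancestral subgraph: join each node to its parents
    # and join every pair of parents that share a child.
    nbrs = {v: set() for v in keep}
    for v in keep:
        for p in parents[v]:
            nbrs[v].add(p)
            nbrs[p].add(v)
        for p, q in combinations(parents[v], 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # x and y are d-separated iff z cuts every path in the moral graph.
    seen, stack = {x}, [x]
    while stack:
        for w in nbrs[stack.pop()]:
            if w == y:
                return False
            if w not in seen and w not in z:
                seen.add(w)
                stack.append(w)
    return True


def sufficient(parents, a, y, z):
    """Backdoor check for a set z of nondescendants of a."""
    # Deleting the edges out of a leaves only backdoor paths between a and y.
    no_out = {v: [p for p in ps if p != a] for v, ps in parents.items()}
    return d_separated(no_out, a, y, set(z))


def minimal_sufficient_sets(parents, a, y, covariates):
    """All sufficient sets that have no sufficient proper subset."""
    suff = [set(s) for r in range(len(covariates) + 1)
            for s in combinations(covariates, r)
            if sufficient(parents, a, y, s)]
    return [s for s in suff if not any(t < s for t in suff)]


minimal = minimal_sufficient_sets(PARENTS, "A", "Y", ["C1", "C2"])
definition_3 = set.intersection(*minimal)   # members of every minimal set
definition_4 = set.union(*minimal)          # members of some minimal set

print("minimally sufficient sets:", minimal)
print("Definition 3 confounders:", definition_3)
print("Definition 4 confounders:", definition_4)
print("union sufficient:", sufficient(PARENTS, "A", "Y", definition_4))
print("empty set sufficient:", sufficient(PARENTS, "A", "Y", set()))
```

On this assumed diagram the sketch finds {C1} and {C2} as the minimally sufficient sets, so Definition 3 yields no confounders even though the empty set is not sufficient, while the Definition 4 set {C1, C2} is sufficient.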

A variable that is a confounder under Definition 4 will in general satisfy 
Property 2B as well but may not always because, as before, there may be 
confounding in distribution without the average causal effect on the additive 
scale being confounded. Definition 4 thus satisfies Property 2A, generally 
Property 2B, and, as shown in the proof above, also satisfies Property 1 for 
all causal diagrams. That Definition 4 satisfies Property 1 can be restated 
as the proposition that the union of all minimally sufficient adjustment sets 
is itself a sufficient adjustment set. Definition 4 thus satisfies the proper- 
ties which arguably ought to be required for a reasonable definition of a 
"confounder." 

Under Definition 5, a confounder was essentially defined as a pre-exposure 
covariate, the control for which helped reduce bias. 

Proposition 5. Definition 5 does not satisfy Property 1. Definition 5 
satisfies Property 2B but not 2A.

Proof. Suppose that $Y_a \perp\!\!\!\perp A \mid C$, that (C, A, Y) are all binary and that
$P(C = 1) = 1/2$, $P(A = 1 \mid c) = 1/4 + c/2$ and
$P(Y = 1 \mid a, c) = 4/10 - 4c/10 - 3a/10 + 8ac/10$. One can then verify that
$E(Y_1) = \sum_c E(Y \mid A = 1, c)\,\mathrm{pr}(c) = 3/10$, $E(Y \mid A = 1) = 4/10$,
$E(Y_0) = \sum_c E(Y \mid A = 0, c)\,\mathrm{pr}(c) = 2/10$ and $E(Y \mid A = 0) = 3/10$. Thus,
$|\sum_c \{E(Y \mid A = 1, c) - E(Y \mid A = 0, c)\}\,\mathrm{pr}(c) - \{E(Y_1) - E(Y_0)\}| = 0 = |\{E(Y \mid A = 1) - E(Y \mid A = 0)\} - \{E(Y_1) - E(Y_0)\}|$
and so under Definition 5, C would not be a confounder. The set of variables
defined as confounders under Definition 5 would thus be empty. However,
it is not the case that adjustment for the empty set suffices to control for
confounding since, for example, $E(Y_1) = 3/10 \neq 4/10 = E(Y \mid A = 1)$. Thus,
Definition 5 does not satisfy Property 1. We now show that Definition 5 does
not satisfy Property 2A. Consider the causal diagram in Figure 4. Although
control for C2 might reduce bias compared to an unadjusted estimate and
thus satisfy Definition 5 with $X = \emptyset$, there is no X such that the effect of
A on Y is unconfounded conditional on (X, C2) but not on X alone. Thus,
Definition 5 does not satisfy Property 2A. Definition 5 satisfies Property 2B
trivially. This completes the proof. □

Fig. 4. Definition 5 does not satisfy Property 2A.

Definition 5 does not satisfy Property 1 because an unadjusted estimate 
of the causal risk difference may be correct, even in the presence of
confounding, because the bias due to confounding for $E(Y_1)$ may cancel that
for $E(Y_0)$; said another way, there may be confounding in the distribution
of counterfactual outcomes without there being confounding in a particular
measure. That Definition 5 satisfies Property 2B is essentially embedded
in Definition 5 itself. Intuitively, to see that Definition 5 does not satisfy
Property 2A, consider the causal diagram in Figure 4. Although control for
C2 might reduce bias compared to an unadjusted estimate and thus satisfy
Definition 5 with $X = \emptyset$, there would be no X such that the effect of A on
Y is unconfounded conditional on (X, C2) but not on X alone. 
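
The cancellation can be checked directly on the numerical example from the proof of Proposition 5. The Python sketch below recomputes the four means with exact fractions; the function names are hypothetical, and E(Y_a) is obtained by standardization over C, which relies on the assumption stated in the proof that Y_a is independent of A given C.

```python
from fractions import Fraction as F

P_C = {0: F(1, 2), 1: F(1, 2)}           # P(C = c)


def p_a(a, c):
    """P(A = a | C = c), with P(A = 1 | c) = 1/4 + c/2."""
    p1 = F(1, 4) + F(c, 2)
    return p1 if a == 1 else 1 - p1


def p_y1(a, c):
    """P(Y = 1 | A = a, C = c) from the proof of Proposition 5."""
    return F(4, 10) - F(4 * c, 10) - F(3 * a, 10) + F(8 * a * c, 10)


def counterfactual_mean(a):
    """E(Y_a) by standardization over C (uses Y_a independent of A given C)."""
    return sum(p_y1(a, c) * P_C[c] for c in P_C)


def observed_mean(a):
    """E(Y | A = a), weighting by the distribution of C among A = a."""
    joint = {c: p_a(a, c) * P_C[c] for c in P_C}      # P(A = a, C = c)
    return sum(p_y1(a, c) * joint[c] for c in P_C) / sum(joint.values())


causal_rd = counterfactual_mean(1) - counterfactual_mean(0)
crude_rd = observed_mean(1) - observed_mean(0)

print(counterfactual_mean(1), observed_mean(1))   # 3/10 2/5: E(Y_1) differs from E(Y | A = 1)
print(counterfactual_mean(0), observed_mean(0))   # 1/5 3/10: E(Y_0) differs from E(Y | A = 0)
print(causal_rd, crude_rd)                        # 1/10 1/10: the two biases cancel
```

Each arm is biased by 1/10 in the same direction, so the crude and causal risk differences coincide even though neither counterfactual mean is recovered without adjustment.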

Under Definition 6, a confounder was defined as a pre-exposure covariate, 
the control for which in some context changed the effect estimate. 

Proposition 6. Definition 6 does not satisfy Property 1. Definition 6 
does not satisfy Properties 2A or 2B. 

Proof. In the first example in the proof of Proposition 5, the set of
confounders under Definition 6 would be empty because with X empty we have
$\sum_{x,c} \{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) = \sum_x \{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$.
However, the effect of A on Y is not unconfounded
conditional on the empty set. Thus, Definition 6 does not satisfy
Property 1.

We now show Definition 6 does not satisfy Properties 2A or 2B. Consider
the causal diagram in Figure 1. If we let X denote the empty set, then C3
will satisfy Definition 6 and so would be a confounder under Definition 6.
However, if we consider Properties 2A and 2B, there is no set of pre-exposure
covariates X on the graph such that control for C3 helps eliminate or reduce
bias. To see this, note that if X includes C1 or C2, then the effect estimate
is unbiased irrespective of whether adjustment is made for C3. If X includes
neither C1 nor C2, then the estimand without adjustment for C3 is unbiased
whereas the estimand adjusted for C3 is not. Therefore, Definition 6 does
not satisfy Properties 2A and 2B. This completes the proof. □

As with Definition 5, Definition 6 does not satisfy Property 1 because of 
the possibility of cancellations: there may be confounding in the distribution 
of counterfactual outcomes without there being confounding in a particular
measure. Definition 6 also fails to satisfy Properties 2A or 2B. It fails because
of the possibility of "M-bias" or "collider-stratification" structures as in
Figure 1 [Greenland (2003), Hernán et al. (2002)]. Controlling for a variable
such as C3 may change the estimate, but it may be that it is the estimate
without control for that variable (e.g., C3 in Figure 1) that is unbiased.
Also, as noted above, the collapsibility-based definitions fail for odds ratio
and hazard ratio measures for other reasons, namely, because marginal and
conditional measures are not comparable even in the absence of confounding. 
See Greenland, Robins and Pearl (1999), Geng et al. (2001) and Geng and 
Li (2002) for further discussion of the relationship between, and general 
nonequivalence of, confounding and collapsibility. 

Candidate definitions for a confounder might thus include Definition 4 
and, if the issue of scale dependence is set aside, Definition 5. Note, however,
that a variable that satisfies Definition 5 but not Definition 4 will never
help to eliminate confounding bias, only to reduce such bias. Such a vari- 
able reduces bias essentially by serving as a proxy for a variable that does 
satisfy Definition 4. We therefore propose that a confounder be defined as in 
Definition 4, "a pre-exposure covariate that is a member of some minimally 
sufficient adjustment set" and that any variable that satisfies Definition 5 
but not Definition 4 be referred to as a "surrogate confounder." The termi- 
nology of a "surrogate confounder" or "proxy confounder" appears elsewhere 
[Greenland and Morgenstern (2001), Hernán (2008)]; here we have provided
a formal criterion for such a "surrogate confounder." See Greenland and 
Pearl (2011) and Ogburn and VanderWeele (2012) for properties of such 
surrogate confounders. 

Interestingly, Definition 4 is closely related to definitions concerning
confounders proposed by Robins and Morgenstern (1987), though their
definitions were not universally adopted by the epidemiologic community over the
ensuing 25 years. Robins and Morgenstern (1987) were not principally
concerned with how the word "confounder" is employed in practice when used
in an unqualified sense, but rather with whether a particular variable would 
still, in some sense, be a confounder if data were also available on other 
variables. As noted above, Robins and Morgenstern [(1987), Section 2H] say 
that C is a confounder conditional on F if causal effects are computable 
given data on C and F, but not on F alone. In the framework of Robins 
and Morgenstern, if one were to take as the (unconditional) definition of a 
confounder that "there exists some set F such that C is a confounder con- 
ditional on F [in the sense of Robins and Morgenstern (1987), Section 2H]," 
then this would coincide with Definition 4. Note that Robins and Morgen- 
stern, in their definitions, in some sense go further than Definition 4 in 
having the investigator explicitly specify the other variables F for which 
control might be made. This would indeed be useful in practice, though cur- 
rent use of language has not generally adopted this convention. It might in 
the future be helpful to distinguish between the unqualified use of the word 
"confounder" as defined in Definition 4, and "confounder in the context of
having data also on F" as in Robins and Morgenstern (1987). The former 
is arguably how the word "confounder" is often used in practice; the latter 
would be useful in making decisions about data collection and confounder 
control. 

6. Some extensions, implications and further results. In the discussion 
above we have considered whether a covariate is a "confounder" in an un- 
conditional sense. However, we might also speak about whether a variable 
C is a confounder for the effect of A on Y conditional on some set of
covariates L which an investigator is going to condition on irrespective of whether
control is made for C. Definition 4 above, the definition for an
"unconditional confounder," could be restated as follows: a pre-exposure covariate C
is a confounder for the effect of A on Y if there exists a set of pre-exposure
covariates X such that $Y_a \perp\!\!\!\perp A \mid (X, C)$ but there is no proper subset T of
(X, C) such that $Y_a \perp\!\!\!\perp A \mid T$. The conditional analogue would then be as
follows: we say that a pre-exposure covariate C is a confounder for the effect
of A on Y conditional on L if there exists a set of pre-exposure covariates
X such that $Y_a \perp\!\!\!\perp A \mid (X, L, C)$ but there is no proper subset T of (X, C)
such that $Y_a \perp\!\!\!\perp A \mid (T, L)$. Consider again the causal diagram in Figure 3.
Here, C2 would be a confounder under Definition 4. However, C2 is not a
confounder for the effect of A on Y conditional on L = C1. Consider once
more the causal diagram in Figure 1. Here, neither C1 nor C2 would be a
confounder under Definition 4. However, conditional on L = C3, both C1
and C2 would be confounders.
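
The unconditional and conditional definitions can be compared mechanically on small diagrams. The Python sketch below is a minimal illustration under stated assumptions: the two diagrams are hypothetical stand-ins for Figures 3 and 1 (a chain C1 -> C2 -> A with C1 -> Y, and an "M" structure C1 -> A, C1 -> C3 <- C2, C2 -> Y), sufficiency is checked through the backdoor criterion, and the function names are invented for the example.

```python
from itertools import combinations

# Hypothetical diagrams (assumptions standing in for Figures 3 and 1).
# Each node maps to its list of parents; both diagrams also contain A -> Y.
CHAIN = {"C1": [], "C2": ["C1"], "A": ["C2"], "Y": ["C1", "A"]}
M_DIAGRAM = {"C1": [], "C2": [], "C3": ["C1", "C2"],
             "A": ["C1"], "Y": ["C2", "A"]}


def d_separated(parents, x, y, z):
    """True if x and y are d-separated given z (moralization method)."""
    keep, stack = {x, y} | z, list({x, y} | z)
    while stack:                          # collect the ancestral set
        for p in parents[stack.pop()]:
            if p not in keep:
                keep.add(p)
                stack.append(p)
    nbrs = {v: set() for v in keep}
    for v in keep:                        # moralize the ancestral subgraph
        for p in parents[v]:
            nbrs[v].add(p)
            nbrs[p].add(v)
        for p, q in combinations(parents[v], 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    seen, stack = {x}, [x]
    while stack:                          # search for a path that avoids z
        for w in nbrs[stack.pop()]:
            if w == y:
                return False
            if w not in seen and w not in z:
                seen.add(w)
                stack.append(w)
    return True


def sufficient(parents, a, y, z):
    """Backdoor check for a set z of nondescendants of a."""
    no_out = {v: [p for p in ps if p != a] for v, ps in parents.items()}
    return d_separated(no_out, a, y, set(z))


def is_confounder(parents, a, y, covariates, c, given=()):
    """Definition 4 when `given` is empty; its conditional analogue otherwise."""
    cond = set(given)
    others = [v for v in covariates if v != c and v not in cond]
    for r in range(len(others) + 1):
        for x in combinations(others, r):
            s = sorted(set(x) | {c})
            if not sufficient(parents, a, y, set(s) | cond):
                continue
            # (X, C) qualifies only if no proper subset T works together with L.
            proper = (t for k in range(len(s)) for t in combinations(s, k))
            if not any(sufficient(parents, a, y, set(t) | cond) for t in proper):
                return True
    return False


chain_c2 = is_confounder(CHAIN, "A", "Y", ["C1", "C2"], "C2")
chain_c2_given_c1 = is_confounder(CHAIN, "A", "Y", ["C1", "C2"], "C2", given=["C1"])
m_vars = ["C1", "C2", "C3"]
m_c1 = is_confounder(M_DIAGRAM, "A", "Y", m_vars, "C1")
m_c1_given_c3 = is_confounder(M_DIAGRAM, "A", "Y", m_vars, "C1", given=["C3"])
m_c2_given_c3 = is_confounder(M_DIAGRAM, "A", "Y", m_vars, "C2", given=["C3"])
print(chain_c2, chain_c2_given_c1, m_c1, m_c1_given_c3, m_c2_given_c3)
```

Under these assumed diagrams the results agree with the text: C2 is a confounder on the chain but not conditional on C1, and on the "M" structure C1 is not a confounder unconditionally while both C1 and C2 are confounders conditional on C3.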

An analogue of Definition 4 could also be given for a particular causal 
parameter of interest rather than for the condition of nonconfounding in 
distribution $Y_a \perp\!\!\!\perp A \mid S$. For example, C could be defined to be a confounder
for a particular causal parameter (e.g., the causal risk difference or causal
risk ratio) if there exists a set of pre-exposure covariates X such that the
parameter is identified by adjusting for (X, C) and if for no proper subset
T of (X, C) is the parameter identified by adjusting for T [cf. Robins and
Morgenstern (1987)]. However, when we restrict attention to particular pa- 
rameters we reintroduce some of the complications with cancellations that 
were noted above. For example, due to cancellations, a variable C may be 
a confounder for the causal risk difference but not for the causal risk ratio 
[cf. VanderWeele (2012)]. 

We have restricted our attention in this paper thus far to pre-exposure co- 
variates as potential confounders. We have done so in order to correspond as 
closely as possible to the discussion in the epidemiologic and potential out- 
comes literatures. However, within the context of causal diagrams, a some- 
what broader range of variables could be considered as "confounders" in that 
all of the discussion above is applicable if we consider all nondescendants of
A as potential confounders rather than simply considering pre-exposure co- 
variates. 

Throughout the paper we have given all definitions with respect to a 
particular underlying causal diagram. However, for a given exposure A and 
a given outcome Y, there will be multiple causal diagrams that correctly
represent the causal structure relating these variables to one another and 
to covariates. One diagram may be an elaboration of another and contain 
variables that the other does not. It is straightforward to verify that if a 
variable C is classified as a confounder under Definitions 1, 2, 4, 5 or 6, 
then C will also be a confounder under each of those definitions respectively 
on any expanded causal diagram with additional variables. In the case of 
Definition 1, this is because associations that hold conditional on covari- 
ates X for one diagram will clearly also hold for the other. In the case of 
Definition 2, if C blocks a backdoor path on one causal diagram, it will 
block a backdoor path on any larger diagram that also correctly describes 
the causal structure. In the case of Definition 4, if there is some minimally 
sufficient adjustment set S of which C is a member, then that set will also 
be minimally sufficient on any larger diagram that also correctly describes 
the causal structure. In the case of Definitions 5 and 6, if the inequalities 
in these definitions hold for some covariate set X for one diagram, they will 
clearly also hold for the other. Only Definition 3 does not share this
property. To see this, consider Figure 3; if in Figure 3, we collapsed over C2 so
that the causal diagram involved only C1, A and Y, then C1 would be a
member of every minimally sufficient adjustment set for this diagram and
thus a confounder under Definition 3. However, as we saw above, C1 is not
a confounder under Definition 3 for Figure 3 itself which includes the extra
variable C2. This failure is a serious problem with Definition 3, but, as we
also saw above, Definition 3 suffers from other limitations as well.

Several fairly trivial implications follow from Definition 4 and may be
worth noting for the sake of completeness. First, if a causal diagram had a 
variable C with an arrow to log(C) (or vice versa) and if C were a member 
of a minimally sufficient adjustment set, then, under Definition 4, both C 
and log(C) would be considered "confounders," though log(C) would not be 
a confounder conditional on C, and likewise C would not be a confounder 
conditional on log(C). We believe that this is in accord with epidemiologic 
usage, though it would be peculiar to consider both C and log(C) simul- 
taneously, just as it would be peculiar to include both C and log(C) on a 
causal diagram. Second, if a variable C is measured with error, taking value 
C*, and if the measurement error term $\varepsilon = C^* - C$ were also represented
on the causal diagram, then, if C were a confounder under Definition 4, C*
and $\varepsilon$ would also both be confounders under Definition 4. We believe this is
also in accord with standard epidemiologic usage of "confounder," though
we would in practice rarely refer to $\varepsilon$ as a "confounder" since we rarely have
access to $\varepsilon$. Once again, however, neither C* nor $\varepsilon$ would be confounders
conditional on C. Finally, suppose C1 were height in meters and C2 were
weight in kilograms and that C1 and C2 together sufficed to control for
confounding but neither alone did; let $C_3 = C_2 / C_1^2$ be body mass index (BMI)
and suppose that controlling for C3 alone sufficed to control for confounding.
Then under Definition 4, C1, C2 and C3 would each be confounders, though
C3 would not be a confounder conditional on (C1, C2) and likewise neither
C1 nor C2 would be a confounder conditional on C3. Once again, we believe
this is in accord with traditional epidemiologic usage of "confounder." 

Several implications hold between the different definitions of a confounder 
as stated in the following result. 

Proposition 7. On a causal diagram, if a variable is a confounder un- 
der Definition 3, then it is a confounder under Definitions 4, 2 and 1; if 
under Definition 4, then under Definitions 2 and 1; if under Definition 5, 
then under Definitions 6 and 1; if under Definition 6, then under Defini- 
tion 1. No other implications hold without further assumptions. 

Proof. On a causal diagram, if a variable is a member of every min- 
imally sufficient adjustment set, it must be a member of a minimally suf- 
ficient adjustment set (the existence of a minimally sufficient adjustment 
set is guaranteed by the variables lying on a causal diagram). Thus, if a 
variable is a confounder under Definition 3, then it is a confounder under 
Definition 4. Suppose a variable C satisfies Definition 4, that is, is a
member of some minimally sufficient adjustment set (X, C), but that it does not
satisfy Definition 2, that is, it is not on a backdoor path from A to Y. By
Theorem 5 of Shpitser, VanderWeele and Robins (2010), (X, C) blocks all
backdoor paths from A to Y. If C does not lie on a backdoor path from
A to Y, then X alone would block all backdoor paths from A to Y, which
would contradict that (X, C) is a minimally sufficient adjustment set. Thus,
if C is a confounder under Definition 4, it is a confounder under
Definition 2. That C being a confounder under Definition 4 implies C is a
confounder under Definition 1 follows from the contrapositive of Corollary 4.1
of Robins (1997). If C is a confounder under Definition 5, it must be a
confounder under Definition 6 because the only way C can be a confounder
under Definition 5 is if $\sum_{x,c} \{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c)$
and $\sum_x \{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$ are not equal. If C is not a
confounder under Definition 1, then for every X, C is independent of Y
conditional on (A, X) or of A conditional on X and from this it easily
follows that
$\sum_{x,c} \{E(Y \mid A = 1, x, c) - E(Y \mid A = 0, x, c)\}\,\mathrm{pr}(x, c) = \sum_x \{E(Y \mid A = 1, x) - E(Y \mid A = 0, x)\}\,\mathrm{pr}(x)$
and thus that C is not a confounder under
Definition 6. Thus, if C is a confounder under Definition 6, it must be a
confounder under Definition 1.

We now argue that without further assumptions no other implications 
between the definitions hold. The variable C2 in Figure 4 could satisfy Def- 
inition 1 but does not satisfy Definition 2, so Definition 1 does not imply 
Definition 2. The variable C3 in Figure 1 could satisfy Definition 1, but does 
not satisfy Definitions 3, 4 or 5; thus, Definition 1 does not imply
Definitions 3, 4 or 5. If C is a confounder under Definition 1, in general it will be
under Definition 6 as well, but it may not because of cancellations due to 
scale-dependence. 

If C satisfies the conditions for Definition 2 (i.e., lies on a backdoor path 
from A to Y), it will generally do so for Definitions 1 and 6 but may fail to do
so because of failure of faithfulness or cancellations due to scale-dependence.
In the example given concerning Property 2B in Proposition 2, the variable
C2 in Figure 2 satisfied Definition 2 but does not satisfy Definitions 3, 4
or 5; thus, Definition 2 does not imply Definitions 3, 4 or 5.

It was shown above that if C satisfies the conditions for Definition 3, it will 
satisfy the conditions for Definitions 4, 2 and 1. If C satisfies the conditions 
for Definition 3, it will generally satisfy the conditions for Definitions 5 and 6, 
but it may not do so due to scale-dependence. 

It was shown above that if C satisfies the conditions for Definition 4, it 
will satisfy the conditions for Definitions 2 and 1. In Figure 3, C2 satisfies 
the conditions for Definition 4 but not Definition 3; therefore, Definition 4
does not imply Definition 3. If C satisfies the conditions for Definition 4, it 
will generally satisfy the conditions for Definitions 5 and 6, but it may not 
do so due to scale-dependence. 

It was shown above that if C satisfies the conditions for Definition 5, 
it will satisfy the conditions for Definitions 6 and 1. In the example given 
concerning Property 2B in Proposition 5, the variable C2 in Figure 4 satisfied 
Definition 5 but does not satisfy Definitions 2, 3 or 4; thus, Definition 5 does 
not imply Definitions 2, 3 or 4.

Fig. 5. Logical relationships that hold among definitions. Dashed arrows indicate
implications that will generally hold but may fail due to scale dependence of definitions.

It was shown above that if C satisfies the conditions for Definition 6, it 
will satisfy the conditions for Definition 1. The variable C2 in Figure 4 could 
satisfy Definition 6 but does not satisfy Definition 2, so Definition 6 does not 
imply Definition 2. The variable C3 in Figure 1 could satisfy Definition 6, 
but does not satisfy Definitions 3, 4 or 5; thus, Definition 6 does not imply 
Definitions 3, 4 or 5. □

The implications between the definitions are plotted in Figure 5. Those 
implications that will generally hold but may not hold because of cancella- 
tions due to scale-dependence are indicated with dashed arrows. 

The properties themselves that we have been considering also bear cer- 
tain relations to one another insofar as it is not difficult to show that if 
Property 2A is itself taken as the definition of a confounder, then, on causal 
diagrams, this definition of a confounder also satisfies Property 1. This is 
because if S denotes the set of all nodes C which obey Property 2A and if
S is not a sufficient adjustment set (so there is an open backdoor path $\pi$ from
A to Y), then if we let W be all nondescendants of A other than A and
noncollider nodes on $\pi$, and if we choose a node K on $\pi$ that is not a
descendant of A, then it is the case that K satisfies Property 2A, and is
not a part of S, which would be a contradiction.

Although it is the case that if Property 2A is itself taken as the defini- 
tion of a confounder then this definition also satisfies Property 1 on causal 
diagrams, this does not hold generally within a counterfactual framework. 
Note also that, even on causal diagrams, it is not the case that Property 2A 
implies Property 1; a counterexample to this was given in Proposition 3 
for Definition 3 which satisfies Property 2A but not Property 1. Rather, if 
Property 2A is itself taken as the definition of a confounder, then, on causal 
diagrams, this definition would satisfy Property 1 as well. This raises the 
question as to whether Property 2A itself could be taken as the definition 
of a confounder, as such a definition would satisfy Property 2A (by defini- 
tion) and Property 1 on causal diagrams. Although such a definition would 
satisfy Properties 1 and 2A on causal diagrams, it would also follow from 
this definition that C1 is a confounder for the effect of A on Y in Figure 1,
even though the effect of A on Y is unconfounded without controlling for any
covariates. This is because if Property 2A is taken as the definition of a
confounder, then C1 satisfies Property 2A with X taken as C3. In general,
however, if the effect of A on Y is unconfounded without controlling for any
covariates, we would probably simply say that there are no confounders for
the unconditional effect of A on Y.

7. Concluding remarks. The causal inference literature has provided a 
formal definition of confounding with reference to distributions of counter- 
factual outcomes. The literature now rightly emphasizes the concept of con- 
founding control over that of a "confounder." Nonetheless, the word "con- 
founder" is often still used among applied researchers and in this paper we 
have shown that at least one formal counterfactual-based definition coheres 
with the way in which the word is generally used. We have considered a 
number of candidate proposals often arising from more informal statements 
made in the literature. We have considered whether each of these definitions 
satisfies two properties, namely, (i) that on any causal diagram, control for 
all confounders so defined will control for confounding and (ii) any variable 
qualifying as a confounder under this criterion will in some context remove 
confounding. Only one of the definitions considered here satisfied both of 
these two properties. We thus proposed that a pre-exposure covariate C be 
considered a confounder for the effect of A on Y if there exists a set of
covariates X such that the effect of the exposure on the outcome is unconfounded
conditional on (X, C) but for no proper subset of (X, C) is the effect of the
exposure on the outcome unconfounded given the subset. Equivalently, a 
confounder is a "member of a minimally sufficient adjustment set." This is 
closely related to the definitions concerning confounders given in Robins and 
Morgenstern (1987), though Robins and Morgenstern suggest specifying the 
other variables for which control might be made as well. We have further 
provided a conditional analogue of the proposed definition of a confounder; 
and we have proposed that a variable that helps reduce bias but not elim- 
inate bias be referred to as a "surrogate confounder." The definition of a 
"confounder" above is given rigorously in terms of counterfactuals and, we 
believe, is also in accord with the intuitive properties of a "confounder" im- 
plicitly presupposed by practicing statisticians and epidemiologists. From a 
more theoretical perspective, Definition 4, unlike the other definitions, gives
rise to elegant and useful results, which itself lends further support for its
being taken as the definition of a confounder. 

APPENDIX 

Review of causal diagrams. A directed graph consists of a set of nodes 
and directed edges among nodes. A path is a sequence of distinct nodes 
connected by edges regardless of arrowhead direction; a directed path is a 
path which follows the edges in the direction indicated by the graph's arrows.
A directed graph is acyclic if there is no node with a sequence of directed
edges back to itself. The nodes with directed edges into a node A are said
to be the parents of A; the nodes into which there are directed edges from
A are said to be the children of A. We say that node A is an ancestor of
node B if there is a directed path from A to B; if A is an ancestor of B,
then B is said to be a descendant of A. If X denotes a set of nodes, then
An(X) will denote the ancestors of X and Nd(X) will denote the set of
nondescendants of X. For a given graph G, and a set of nodes S, the graph
$G_S$ denotes a subgraph of G containing only vertices of G in S and only
edges of G between vertices in S. On the other hand, the graph $G_{\overline{S}}$ denotes
the graph obtained from G by removing all edges with arrowheads pointing
to S. A node is said to be a collider for a particular path if it is such that
both the preceding and subsequent nodes on the path have directed edges
going into that node. A path between two nodes, A and B, is said to be
blocked given some set of nodes C if either there is a variable in C on the
path that is not a collider for the path or if there is a collider on the path
such that neither the collider itself nor any of its descendants are in C. For
disjoint sets of nodes A, B and C, we say that A and B are d-separated
given C if every path from any node in A to any node in B is blocked
given C. Directed acyclic graphs are sometimes used as statistical models
to encode independence relationships among variables represented by the
nodes on the graph [Lauritzen (1996)]. The variables corresponding to the
nodes on a graph are said to satisfy the global Markov property for the
directed acyclic graph (or to have a distribution compatible with the graph)
if for any disjoint sets of nodes A, B, C we have that $A \perp\!\!\!\perp B \mid C$ whenever A
and B are d-separated given C. The distribution of some set of variables V
on the graph is said to be faithful to the graph if for all disjoint sets A, B, C
of V we have that $A \perp\!\!\!\perp B \mid C$ only when A and B are d-separated given C.
Directed acyclic graphs can be interpreted as representing causal rela- 
tionships. Pearl (1995) defined a causal directed acyclic graph as a di- 
rected acyclic graph with nodes {Xi,. . . ,Xn) corresponding to variables 
such that each variable Xi is given by its nonparametric structural equation 
Xi = fi{pai,ei), where pai are the parents of Xi on the graph and the Si are 
mutually independent. For a causal diagram, the nonparametric structural 
equations encode counterfactual relationships among the variables repre- 
sented on the graph. The equations themselves represent one-step ahead 
counterfactuals with other counterfactuals given by recursive substitution 
[see Pearl (2009) for further discussion]. A causal directed acyclic graph 
defined by nonparametric structural equations satisfies the global Markov 
property as stated above [Pearl (2009)]. The requirement that the Si be 
mutually independent is essentially a requirement that there is no variable 
absent from the graph which, if included on the graph, would be a parent 
of two or more variables [Pearl (1995, 2009)]. Throughout we assume the 
exposure A consists of a single node. A backdoor path from A to Y is a 
path to Y which begins with an edge into A. A set of variables X is said 
to satisfy the backdoor path criterion with respect to (A, Y) if no variable 
in X is a descendant of A and if X blocks all backdoor paths from A to 
Y. Pearl (1995) showed that if X satisfies the backdoor path criterion with 
respect to (A, Y), then the effect of A on Y is unconfounded given X, that 
is, $Y_a \perp\!\!\!\perp A \mid X$. 
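
As a rough illustration of the graphical definitions above, the following Python sketch checks d-separation and the backdoor path criterion on a small diagram. It is not taken from the paper: the function names, the dictionary encoding of a graph (each node mapped to a list of its parents) and the two example diagrams are assumptions made for illustration. D-separation is checked by the standard moralization argument (restrict to ancestors, connect parents that share a child, drop directions, delete the conditioning set), and backdoor paths are isolated by deleting the edges out of A.

```python
from itertools import combinations

def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen

def descendants(parents, node):
    """Return all proper descendants of a node."""
    found, stack = set(), [node]
    while stack:
        v = stack.pop()
        for child, ps in parents.items():
            if v in ps and child not in found:
                found.add(child)
                stack.append(child)
    return found

def d_separated(parents, a, b, given):
    """True if every path between nodes a and b is blocked given `given`."""
    keep = ancestors(parents, {a, b} | set(given))
    adj = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:                      # undirected parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pa, 2):  # "marry" parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    blocked, seen, stack = set(given), {a}, [a]
    while stack:                          # search that avoids the conditioning set
        v = stack.pop()
        if v == b:
            return False
        for w in adj[v] - blocked - seen:
            seen.add(w)
            stack.append(w)
    return True

def satisfies_backdoor(parents, a, y, x):
    """Backdoor path criterion for a single exposure node a and outcome y."""
    if descendants(parents, a) & set(x):
        return False
    # Deleting the edges out of a leaves only the backdoor paths from a to y.
    cut = {v: [p for p in ps if p != a] for v, ps in parents.items()}
    return d_separated(cut, a, y, x)

# Hypothetical diagram 1: C -> A, C -> Y, A -> M -> Y, A -> Y.
confounded = {"A": ["C"], "M": ["A"], "Y": ["A", "C", "M"]}
# Hypothetical diagram 2: U1 -> A, U1 -> C, U2 -> C, U2 -> Y, A -> Y,
# so that C is a collider on the only backdoor path.
m_bias = {"A": ["U1"], "C": ["U1", "U2"], "Y": ["A", "U2"]}

print(satisfies_backdoor(confounded, "A", "Y", {"C"}))
print(satisfies_backdoor(m_bias, "A", "Y", set()))
print(satisfies_backdoor(m_bias, "A", "Y", {"C"}))
```

In the second diagram the empty set satisfies the criterion while {C} does not, because conditioning on the collider C unblocks the path through U1 and U2.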

Empirical testing for confounders and confounding. The absence of 
confounding conditional on a set of covariates S, that is, $Y_a \perp\!\!\!\perp A \mid S$, is not a 
property that can be tested empirically with data. One must rely on subject 
matter knowledge, which may sometimes take the form of a causal diagram. 
Nonetheless, a few things can be said about empirical testing concerning 
confounding and confounders. For the sake of completeness, we will con- 
sider each of Definitions 1-6. It is possible to verify empirically whether a 
variable is a confounder under Definition 1 since the definition refers to ob- 
served associations; however, it is not possible, without further knowledge, 
to empirically verify that a variable does not satisfy Definition 1 because 
a variable may satisfy Definition 1 for some X that involves an unmea- 
sured variable U. One would have to know that data were available for all 
variables on a causal diagram to empirically verify that a variable was a 
nonconfounder under Definition 1. Because of this, even though Definition 1 
satisfies Property 1 under faithfulness, this cannot be used as an empirical 
test for confounding since (i) we cannot empirically verify that a variable 
is a nonconfounder under Definition 1 and (ii) we cannot empirically verify 
whether faithfulness holds. 

Without further assumptions, we cannot empirically verify that a variable 
is a confounder or a nonconfounder under Definition 2 because Definition 2 
makes reference to backdoor paths. Whether a variable lies on a backdoor 
path cannot be tested empirically without further assumptions; one would 
have to know the structure of the underlying causal diagram. Likewise, for 
Definitions 3 and 4, one would need to know all minimally sufficient 
adjustment sets, which itself would require checking the "no confounding" 
condition $Y_a \perp\!\!\!\perp A \mid S$, which is, as noted above, not empirically testable; though 
see below for some qualifications. For Definition 5, we could empirically 
reject the inequality in Definition 5 for observed X if 
$\sum_{x,c}\{E(Y \mid A=1,x,c) - E(Y \mid A=0,x,c)\}\,\mathrm{pr}(x,c) = \sum_{x}\{E(Y \mid A=1,x) - E(Y \mid A=0,x)\}\,\mathrm{pr}(x)$. 
However, we cannot empirically reject the inequality in Definition 5 for 
unobserved X and we, moreover, cannot empirically verify the inequality in 
Definition 5 because $E(Y_1) - E(Y_0)$ will not in general be empirically identified 
if there are unobserved variables. We can verify empirically whether a vari- 
able is a confounder under Definition 6 since the definition refers to only 
observed variables; however, it is not possible, without further knowledge, 
to empirically verify that a variable does not satisfy Definition 6 because 
a variable may satisfy Definition 6 for some X that involves an unmeasured 
variable U. One would have to know that data were available for all 
variables on a causal diagram to empirically verify that a variable was a 
nonconfounder under Definition 6. Because of this we cannot empirically 
verify that a variable is a nonconfounder under Definition 6. 
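
The two standardized quantities compared in the check for Definition 5 can be computed directly from observed data. The following Python sketch is illustrative only: the function name, the record layout (dictionaries with a binary exposure under key "A", an outcome under "Y" and covariates under other keys) and the toy data are assumptions, not part of the paper. It computes the standardized difference, that is, the sum over covariate strata of the stratum-specific difference in mean outcome between exposed and unexposed, weighted by the stratum proportion.

```python
from collections import defaultdict

def standardized_difference(rows, covariates):
    """Sum over strata s of {E(Y | A=1, s) - E(Y | A=0, s)} pr(s)."""
    sums = defaultdict(lambda: [0.0, 0.0])   # stratum -> [sum of Y for A=0, for A=1]
    counts = defaultdict(lambda: [0, 0])     # stratum -> [n for A=0, n for A=1]
    for r in rows:
        s = tuple(r[k] for k in covariates)
        sums[s][r["A"]] += r["Y"]
        counts[s][r["A"]] += 1
    total = 0.0
    for s, (n0, n1) in counts.items():
        if n0 == 0 or n1 == 0:
            raise ValueError(f"stratum {s} lacks exposed or unexposed records")
        diff = sums[s][1] / n1 - sums[s][0] / n0
        total += diff * (n0 + n1) / len(rows)
    return total

def records(c, a, outcomes):
    """Build toy records for one (C, A) cell; purely illustrative data."""
    return [{"C": c, "A": a, "Y": y} for y in outcomes]

# Hypothetical data with X empty and a single candidate covariate C.
rows = (records(0, 0, [1, 0, 0, 0]) + records(0, 1, [1, 0])
        + records(1, 0, [1, 0]) + records(1, 1, [1, 1, 1, 0]))

with_c = standardized_difference(rows, ["C"])   # standardized over C
without_c = standardized_difference(rows, [])   # crude difference (X empty)
print(with_c, without_c)
```

In these toy data the two quantities differ, so the inequality in Definition 5 would not be rejected on this basis. With real samples the comparison would call for a formal statistical test rather than exact equality, and it says nothing about covariate sets X that include unobserved variables.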

Determining whether a variable is a confounder requires making untestable 
assumptions. The only real progress that can be made with empirical test- 
ing for confounders is by making other untestable assumptions that logically 
imply a test for assumptions we care about. For example, suppose we as- 
sume we have some set S that we are sure constitutes a sufficient adjustment 
set. In this case, we can sometimes remove variables as unnecessary for 
confounding control. In particular, Robins (1997) showed that if we knew that 
for covariate sets $S_1$ and $S_2$ we had that $Y_a \perp\!\!\!\perp A \mid (S_1, S_2)$, then we would also 
have that $Y_a \perp\!\!\!\perp A \mid S_1$ if $S_2$ can be decomposed into two disjoint subsets $T_1$ 
and $T_2$ such that $A \perp\!\!\!\perp T_1 \mid S_1$ and $Y \perp\!\!\!\perp T_2 \mid A, S_1, T_1$. Both of these latter 
conditions are empirically testable. Geng et al. (2001) provide some analogous 
results for the effect of exposure on the exposed. VanderWeele and Shpitser 
(2011) note that if for covariate set S we have that $Y_a \perp\!\!\!\perp A \mid S$, then if a 
backward selection procedure is applied to S such that variables are iteratively 
discarded that are independent of Y conditional on both exposure A and 
the members of S that have not yet been discarded, then the resulting set 
of covariates will suffice for confounding control. They also show that under 
an additional assumption of faithfulness, if, for covariate set S, we have that 
$Y_a \perp\!\!\!\perp A \mid S$, then if a forward selection procedure is applied to S such that, 
starting with the empty set, variables are iteratively added which are 
associated with Y conditional on both exposure A and the variables that have 
already been added, then the resulting set of covariates will suffice for 
confounding control. Note, however, all of these results require knowledge that 
for some set S, $Y_a \perp\!\!\!\perp A \mid S$, which is not itself empirically testable without 
experimental interventions. 
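
The backward and forward selection procedures just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of VanderWeele and Shpitser (2011): the function names are hypothetical, and the conditional independence judgment is delegated to a user-supplied function `indep_of_y(v, others)`, assumed to report whether covariate v is independent of Y conditional on exposure A and the covariates in `others`. In practice that function would be a statistical test; here it is a hard-coded stub. Both procedures presuppose that the starting set S is known to suffice for confounding control, and the forward procedure additionally presupposes faithfulness.

```python
def backward_select(s, indep_of_y):
    """Iteratively discard covariates that are independent of Y given A
    and the members of S not yet discarded."""
    kept = list(s)
    changed = True
    while changed:
        changed = False
        for v in kept:
            rest = [w for w in kept if w != v]
            if indep_of_y(v, rest):
                kept = rest          # discard v and rescan the reduced set
                changed = True
                break
    return kept

def forward_select(s, indep_of_y):
    """Starting from the empty set, iteratively add covariates that are
    associated with Y given A and the covariates already added."""
    added = []
    changed = True
    while changed:
        changed = False
        for v in s:
            if v not in added and not indep_of_y(v, added):
                added.append(v)
                changed = True
                break
    return added

def stub_indep_of_y(v, others):
    """Hypothetical oracle: C3 is never associated with Y given A, and C2
    is independent of Y given A once C1 is among the conditioning covariates."""
    return v == "C3" or (v == "C2" and "C1" in others)

candidate_set = ["C1", "C2", "C3"]   # assumed to be a sufficient adjustment set
print(backward_select(candidate_set, stub_indep_of_y))
print(forward_select(candidate_set, stub_indep_of_y))
```

With this stub both procedures retain only C1. In general the result may depend on the order in which covariates are examined.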